Growing Object-Oriented Software vs what I would do
TL;DR: I review Growing Object-Oriented Software, Guided By Tests and contrast it with my personal approach to developing software, explaining my reasoning, and making some comments on the book, OOP and software engineering in general.
Introduction
There are two reasons for this post:
First, I have been programming for many years now (I started programming when I was 7 years old), and I went through a lot of phases, styles, and programming languages. At a certain point, I established an approach that I don't have a name for but that has been working very well for me, and I'd like to share it.
Second, I'm looking to gain more confidence in my criticism and understanding of OOP. In the past, I have published multiple posts criticizing Object Oriented Programming (just look at the #oop tag). OOP in C++ & Java was one of the phases I went through, and looking back at my approach, the code I produced, and the design of my programs, I feel distaste. However, for most of my career, I was working in low-level systems software, kernel, and embedded programming, which resulted in using structured, procedural, imperative programming most of the time. Since my OOP was mostly self-taught by reading books, articles, etc., I always feel this anxiety that... maybe there is such a thing as “good OOP”, maybe all the OOP code I wrote, and the OOP code I keep seeing here and there, is just “incorrect OOP”. Because of that, I decided to study OOP in more depth and focus on more practical contrasts between OOP projects and what I would consider “Rational Programming” – unencumbered by dogma and programming ideologies, to the point, and open to employing any good idea and programming style available.
The biggest obstacle to gaining more confidence in my understanding of OOP is ... actually finding examples of Good OOP. How do I know that something is Good OOP? Is there a definite example that everyone agrees on? I tried to ask around via a blog post. In the process of collecting links and recommendations, I purchased three OOP books (in the order I read them):
- Elegant Objects by Yegor Bugayenko
- Java OOP Done Right by Alan Mellor
- Growing Object-Oriented Software, Guided By Tests by Steve Freeman and Nat Pryce
Maybe I'll find time to write more detailed posts to discuss each book, but in short:
Elegant Objects turned out to be the most ridiculous. The author pretty quickly states as a fact that the problem with procedural programming is maintainability, and OOP is here to save us all. Funnily enough, as someone who spent years working with C and structured/procedural programming and can contrast it with all the OOP I've seen, I find this claim completely misguided. The lack of any reasonable arguments behind its ideas is the book's defining feature. Among other things, the author states things like:
- Functional Programming is OK, but only has functions, while OOP has objects and methods so is better.
- An object should have only up to 5 public functions because “There's no particular reason; it's just how I feel”
- “Don't document your code; make it clean instead”.
There are many more claims like this, backed by “because I tell you so”. That book really both deserves and does not deserve a separate post, but to sum up: it's just a list of unsubstantiated opinions of the author.
Java OOP Done Right was just bland. A typical OOP book, like some I've read in the past. My main complaint is that it's all over the place and that it doesn't try to explain what it means by OOP. The example of class Cat and class Dog with public void chase() in the Collaboration chapter is just utterly hopeless and confusing. I can't recommend it even if you think OOP is the best thing since sliced bread.
Thankfully, the last book, Growing Object-Oriented Software, turned out to be really good, so that's the one I'm going to focus on.
Reviewing Growing Object-Oriented Software
Even though I still think OOP is a bad idea and this book did not change my mind about it, I enjoyed reading it and I feel like I've learned something from it. If you're set on learning OOP, I can recommend it, I guess.
The authors highlight multiple times that not everything in your code is an object, and that a lot of things are just plain data – ignoring this is one of the things I keep agonizing over in the OOP code I get to see in the wild. This warmed my heart to the book from the very start. Not sure what The Real OOPers like Smalltalkers think about it, but it does fit my worldview.
The beginning of the book talks about software practices and starting up a software project. Even if I don't agree with everything said, I've found it reasonable, respectable, and insightful.
Then the book moves on to describe an incremental TDD & OOP approach to implementing a small but realistic software project. Real or realistic code is exactly what I am looking for!
I quickly realized that the authors have a concrete idea of what an object means to them. I was confused why their code was always so... “callback-y”, and after studying it a little more, I discovered the reason. I might have missed it, but I don't think they ever explicitly state it. All calls between objects are unidirectional: no public method of an actual object (not a plain data class) returns any value. They are always void methods. Objects don't “call” each other. They send a message and don't wait for a response. (Well, since they actually “send” it via a method call, they do wait, but they pretend they don't.)
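To make that shape concrete, here is a minimal sketch of the “send a message, return nothing” style as I understand it. It is written in Rust rather than the book's Java, and every name in it (PriceUpdate, Auction, Sniper, RecordingAuction) is my own hypothetical illustration, not code from the book.

```rust
/// Plain data: a message with no behaviour of its own (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
pub struct PriceUpdate {
    pub current_price: u32,
    pub increment: u32,
}

/// A collaborator role: it accepts a message and returns nothing.
pub trait Auction {
    fn bid(&mut self, amount: u32);
}

/// An "object" in this style: it reacts to messages by sending more messages.
pub struct Sniper<'a> {
    auction: &'a mut dyn Auction,
    stop_price: u32,
}

impl<'a> Sniper<'a> {
    pub fn new(auction: &'a mut dyn Auction, stop_price: u32) -> Self {
        Sniper { auction, stop_price }
    }

    /// Note the missing return value: the caller learns nothing from this call.
    pub fn price_changed(&mut self, update: &PriceUpdate) {
        let next_bid = update.current_price.saturating_add(update.increment);
        if next_bid <= self.stop_price {
            self.auction.bid(next_bid);
        }
    }
}

/// A recording fake: with no return values, observing the outgoing
/// messages is the only way a test can see what happened.
#[derive(Default)]
pub struct RecordingAuction {
    pub bids: Vec<u32>,
}

impl Auction for RecordingAuction {
    fn bid(&mut self, amount: u32) {
        self.bids.push(amount);
    }
}
```

Since nothing is returned, the only observable effect of price_changed is the message that arrives at the collaborator, which is why code in this style ends up wired together from listeners and fakes.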
That discovery blew my mind initially. I panicked. “OMG, is this the secret sauce? Is the joke on me? Does everybody else find it so obvious that it's not worth mentioning and I'm the clown that didn't realize it? Does this make the Good OOP?” But I looked back into the two previous books, and the code there is not like this. Bah – the authors of those books can't even make a distinction between “plain data” and “objects”, so how could they have objects sending plain data messages? If I'm the clown, so are the others. If this is a secret sauce, there are a lot of clowns around, not just me.
Then I started thinking about it. This approach turns your program into a distributed system. It tries to model something akin to a microservice architecture, where all communication passes through direct message queues. While message queues and microservices are often a great solution to real software business needs, they introduce tons of problems and challenges and are hard to get right. In this book, the authors take this challenge on in the name of OOP, and additionally want to do it for every object. Intuition tells me this can't end well.
And after finishing the book, I just don't buy it.
Here, let's do a test. Here is the GitHub repository with the code of the project from the Growing Object-Oriented Software book. Go look at it and tell me what it does. I'll wait.
The code doesn't have comments, but the design and naming are well thought through. When this code/book was recommended to me, after some clicking around the project I wasn't able to decipher what it does. I wrote:
“By the looks of it, it looks very much reasonable. DI everywhere, objects used for doers and not data, the state expressed with enums. It does however looks somewhat light on state, which is where from my PoV usually problems with OOP happen. It’s hard for me to judge some design decisions well without any comments or the book… So I just went ahead and ordered the book.”
Reading it now, I'm surprised how accurate this comment was. Even now, after reading the book, I find navigating around the code painful. Everything is just... so scattered and abstracted away everywhere. Note: I very often jump into codebases of random software projects that I'm unfamiliar with: Open Source or at work, and I am very quick to find my way around all sorts of codebases.
So while I find this code one of the most elegant pieces of OOP I've found, I think it's a great example of the problems with OOP. No chance I would consider it easy to understand and maintain. For what it does, it's overly complicated and hard to understand.
The TDD story presented in the book is appealing, and I definitely agree that testability is a great indicator of a good design. However, I'm not exactly converted to the TDD religion just yet, though it would be nice to give 100% zealous TDD a try someday and see. As things are presented in the book, the OOP + TDD combination strikes me as a very elaborate and roundabout way to get anywhere.
What I would actually do – reimplementing Auction Sniper
The project being implemented during the course of the book is called Auction Sniper, and it is in essence an automatic bidding bot. The hypothetical auctions happen over the XMPP protocol (a sign of the times when the book was written). The program has a simple one-table GUI, allows users to add an auction to bid on up to a certain price, shows the status and result of each auction as a row, and handles XMPP communication with the auction server. I like this example a lot.
After finishing reading the book, I immediately started to think about “how would I approach writing software like this”.
I don't buy this Agile/TDD approach of “let's just iteratively get to a design”. Trust me, I do believe in the power of iterating, but the belief that iterating by itself, without any upfront design idea, leads anywhere is in my opinion nonsense.
Persistence
When writing a program like this, my first and most important consideration would be data model design:
- how and what will I persist where,
- how and what will I keep in memory.
IMO, the data and the data model are what define any software.
Immediately I remind myself that the implementation from the book ignores the problem of persistence completely. If you close that application, it loses all its state. I think this is not an accident. This is where things go wrong for OOP really fast. Object–relational impedance mismatch, fancy ORMs that always lead you down... you name it. A source of complete failure for many projects. Is there a book I can read that explains how the authors of the Auction Sniper are planning to iteratively add persistence and somehow save this distributed object system in a consistent way?
For me, in this case, there are two asynchronous sources of events (the user, via the UI, and the auction server), and both are important to handle in a way that guarantees not losing any events. Because of that, I'd employ a simplified version of Event Sourcing. Any event would first be appended to a persistent event log, which other components of the system would subscribe and react to, potentially generating new events. To rephrase: every event gets persisted in an ordered log of events right away. This way: we have a global ordering of things that happened, we can't lose any event, we can restore the system state if it restarts, and we can potentially audit/debug/display events relevant to a certain auction.
An event log like this is one of the simplest and most robust communication patterns available. That's basically what a Kafka stream will do for you, but it might as well be a simple append-only file or a table in an SQL database. Each entity following the event log must remember and store the position of the last event it has already processed, and that's all there is to it.
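A minimal sketch of the idea, assuming an in-memory Vec stands in for the append-only file or SQL table. The Event variants and the EventLog and Follower names are hypothetical, picked only for illustration.

```rust
/// Hypothetical events; a real system would have more of them.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    UserAddedAuction { item: String, stop_price: u32 },
    PriceChanged { item: String, price: u32 },
}

/// Append-only log. A Vec stands in for a file or a database table;
/// an entry's position is simply its index.
#[derive(Default)]
pub struct EventLog {
    entries: Vec<Event>,
}

impl EventLog {
    /// Appends an event and returns the position it was stored at.
    pub fn append(&mut self, event: Event) -> usize {
        self.entries.push(event);
        self.entries.len() - 1
    }

    /// Returns all entries at `position` and later (empty if past the end).
    pub fn read_from(&self, position: usize) -> &[Event] {
        let start = position.min(self.entries.len());
        &self.entries[start..]
    }
}

/// A follower only needs to remember the position of the next unread entry.
pub struct Follower {
    cursor: usize,
    pub seen: Vec<Event>,
}

impl Follower {
    pub fn starting_at(cursor: usize) -> Self {
        Follower { cursor, seen: Vec::new() }
    }

    pub fn cursor(&self) -> usize {
        self.cursor
    }

    /// Consumes everything appended since the last call; returns how many
    /// new entries were processed.
    pub fn catch_up(&mut self, log: &EventLog) -> usize {
        let fresh = log.read_from(self.cursor);
        self.seen.extend_from_slice(fresh);
        self.cursor += fresh.len();
        fresh.len()
    }
}
```

Calling catch_up twice in a row processes nothing the second time, and a follower created with starting_at(n) skips the first n entries, which is all a restarted component needs in order to resume.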
Actors/threads
This brings us to another important design consideration: decomposing systems into “actors” (things that can work in parallel without sharing data). I immediately see at least the following separate actors:
- The UI handling system can work in parallel and just write the user requests to the event log, at the same time subscribing to it to display any updates and notifications in the UI.
- The auction communication receiver can receive events and persist them in the event log.
- The auction communication sender can follow the event log and send outgoing events to the auction.
- The bidding engine follows events from both the auction and the UI and reacts to them, potentially producing events that the UI or the auction sender can act on.
All actors communicate via a shared event log, which simplifies things a lot. In real software, there would be some considerations with respect to performance, amount of data, etc. Maybe one event log would impose too much overhead on all actors following it, and splitting it into dedicated logs would be beneficial. Maybe persisting the current state of the bidding engine is necessary, because restoring it from the log on start might be too slow. Snapshots of the state of each auction might make this unnecessary, and so on. For now, the simplest design that we can improve on is good enough.
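Here is a rough sketch of actors as threads sharing one log, under a few assumptions of mine: the log lives in memory behind a Mutex and Condvar, the bidding engine only records what it sees, and a Shutdown event exists purely so the demo can end. None of these names come from the book or from my actual code.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    UserRequest(String),
    AuctionMessage(String),
    Shutdown, // demo-only event so the follower thread can stop
}

/// In-memory stand-in for the shared, ordered event log.
#[derive(Clone, Default)]
pub struct SharedLog {
    inner: Arc<(Mutex<Vec<Event>>, Condvar)>,
}

impl SharedLog {
    pub fn append(&self, event: Event) {
        let (entries, changed) = &*self.inner;
        entries.lock().unwrap().push(event);
        changed.notify_all();
    }

    /// Blocks until there is at least one entry at `position`, then
    /// returns a copy of everything from `position` onwards.
    pub fn wait_from(&self, position: usize) -> Vec<Event> {
        let (entries, changed) = &*self.inner;
        let guard = changed
            .wait_while(entries.lock().unwrap(), |e| e.len() <= position)
            .unwrap();
        guard[position..].to_vec()
    }
}

/// Starts three actors and returns everything the "engine" observed.
pub fn run_actors() -> Vec<Event> {
    let log = SharedLog::default();

    // Bidding engine: follows the log from position 0 with its own cursor.
    let engine = {
        let log = log.clone();
        thread::spawn(move || {
            let mut cursor = 0;
            let mut seen = Vec::new();
            loop {
                for event in log.wait_from(cursor) {
                    cursor += 1;
                    let done = event == Event::Shutdown;
                    seen.push(event);
                    if done {
                        return seen;
                    }
                }
            }
        })
    };

    // UI actor and auction receiver: both only append.
    let ui = {
        let log = log.clone();
        thread::spawn(move || log.append(Event::UserRequest("join item-54321".to_string())))
    };
    let receiver = {
        let log = log.clone();
        thread::spawn(move || log.append(Event::AuctionMessage("price 100".to_string())))
    };

    ui.join().unwrap();
    receiver.join().unwrap();
    log.append(Event::Shutdown);
    engine.join().unwrap()
}
```

The relative order of the two producers is decided by whoever takes the lock first, but once an event is in the log, every follower sees the same order.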
General design
The event log is abstracted away behind two interfaces: one for writers, one for readers. That allows multiple implementations of how the data is actually stored. And since the log is the main IO for all other components, it will be useful to fake it in tests, driving actors independently.
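The writer/reader split could look roughly like this. The trait names, the single AddAuction event and the handle_add_auction function are all hypothetical, and the in-memory implementation is the kind of fake I would use in tests.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    AddAuction { item: String, stop_price: u32 },
}

/// The writing half of the log.
pub trait EventWriter {
    fn append(&mut self, event: Event);
}

/// The reading half of the log.
pub trait EventReader {
    /// Returns entries at `position` and later.
    fn read_from(&self, position: usize) -> Vec<Event>;
}

/// In-memory fake; a file- or SQL-backed log would implement the same traits.
#[derive(Default)]
pub struct InMemoryLog {
    entries: Vec<Event>,
}

impl EventWriter for InMemoryLog {
    fn append(&mut self, event: Event) {
        self.entries.push(event);
    }
}

impl EventReader for InMemoryLog {
    fn read_from(&self, position: usize) -> Vec<Event> {
        self.entries.iter().skip(position).cloned().collect()
    }
}

/// UI-side handler: it depends only on the writer half, so a test can
/// hand it the fake and then inspect what ended up in the log.
pub fn handle_add_auction(log: &mut dyn EventWriter, item: &str, stop_price: u32) {
    log.append(Event::AddAuction {
        item: item.to_string(),
        stop_price,
    });
}
```

A test drives the handler with the fake and asserts on the log contents, with no UI toolkit or network connection involved.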
At the start, the main thread initializes the event log resource and starts all actor resources mentioned before, passing event log resources to them as a dependency (Dependency Injection).
The UI thread just handles the UI and the event log, keeping them in sync. Simple in essence, though details might be hairy. Nowadays the UI would probably consist of an HTTP server handling requests from the frontend, and maybe some websocket connection streaming relevant event log entries the other way.
Both auction communication threads just receive, translate and write events between the actual protocol and event log.
The bidding engine loads its state from the log on start, and then is basically a loop reacting to relevant events from the event log, using logic written in functional programming style to decide what action to take, and then writing that action to the log. A typical “functional code in an imperative shell” loop, really.
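A sketch of what that loop could look like, with a pure decide function as the functional core and a small run loop as the shell. The statuses, events and bidding rule here are my simplified assumptions, not the book's exact logic, and the returned Vec of actions stands in for writing them to the log.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Status { Joining, Bidding, Winning, Losing, Won, Lost }

#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    PriceChanged { price: u32, increment: u32, we_are_leading: bool },
    AuctionClosed,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Action { Bid(u32) }

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SniperState {
    pub stop_price: u32,
    pub status: Status,
}

/// Functional core: no IO, no mutation, just (state, event) -> (state, action).
pub fn decide(state: &SniperState, event: &Event) -> (SniperState, Option<Action>) {
    // A finished auction ignores any further events.
    if matches!(state.status, Status::Won | Status::Lost) {
        return (*state, None);
    }
    match event {
        Event::PriceChanged { we_are_leading: true, .. } => {
            (SniperState { status: Status::Winning, ..*state }, None)
        }
        Event::PriceChanged { price, increment, .. } => {
            let next_bid = price.saturating_add(*increment);
            if next_bid <= state.stop_price {
                (SniperState { status: Status::Bidding, ..*state }, Some(Action::Bid(next_bid)))
            } else {
                (SniperState { status: Status::Losing, ..*state }, None)
            }
        }
        Event::AuctionClosed => {
            let status = if state.status == Status::Winning { Status::Won } else { Status::Lost };
            (SniperState { status, ..*state }, None)
        }
    }
}

/// Imperative shell: feeds events through `decide` and collects the actions.
pub fn run(mut state: SniperState, events: &[Event]) -> (SniperState, Vec<Action>) {
    let mut actions = Vec::new();
    for event in events {
        let (next, action) = decide(&state, event);
        state = next;
        actions.extend(action);
    }
    (state, actions)
}
```

All the interesting logic sits in decide, which can be tested with plain values, while the loop around it stays too dumb to hide bugs.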
Actors following the event stream will need to persist their “cursor” (position in the log) somewhere, which would be an additional resource (interface) backed by an actual database.
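The cursor resource can be tiny. Below is a sketch assuming one cursor per actor name and a HashMap standing in for the actual database; CursorStore, InMemoryCursorStore and process_new are hypothetical names, and the log is reduced to a slice of strings to keep the example short.

```rust
use std::collections::HashMap;

/// Where each actor's position in the log is kept between runs.
pub trait CursorStore {
    fn load(&self, actor: &str) -> usize;
    fn save(&mut self, actor: &str, position: usize);
}

/// In-memory stand-in for a database-backed store.
#[derive(Default)]
pub struct InMemoryCursorStore {
    positions: HashMap<String, usize>,
}

impl CursorStore for InMemoryCursorStore {
    fn load(&self, actor: &str) -> usize {
        // An actor that has never run starts from the beginning of the log.
        self.positions.get(actor).copied().unwrap_or(0)
    }

    fn save(&mut self, actor: &str, position: usize) {
        self.positions.insert(actor.to_string(), position);
    }
}

/// Returns only the entries this actor has not handled yet, then
/// advances its stored cursor to the end of the log.
pub fn process_new(actor: &str, log: &[String], cursors: &mut dyn CursorStore) -> Vec<String> {
    let start = cursors.load(actor).min(log.len());
    let unhandled = log[start..].to_vec();
    cursors.save(actor, log.len());
    unhandled
}
```

Calling process_new again after a simulated restart returns only what was appended in the meantime. In a real system the cursor would be saved after the entries are handled, so a crash in between means an entry may be handled twice and handlers have to tolerate that.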
Summary
And that's all. One could consider the Event Log and the actors/threads to be objects, and I guess that would be fair.
However, I would like to point out important differences between this approach and OOP that I keep seeing.
Data architecture is the most important consideration, tackled upfront as the first priority. Nothing gets written before a data model that fits all the very high-level requirements is agreed on.
These “objects” are very coarse-grained and used only as high-level components. Internally, they are implemented using a combination of functional (where possible) and procedural/imperative (where necessary) programming. They might as well all be separate services in a microservice-based architecture. I've described the whole high-level approach before in another post.
I have a rough sketch of the code already written in Rust and it is fairly straightforward, but as with everything in software – it takes more time to finish than I wish, and I'm not sure if I'll be motivated enough to wrap it up. Just writing this post took me more than 3 hours, plus reading these books, etc. And all in all – does anyone really care? I mean I hope someone does. I hope that either I saved someone a couple of years of fighting with stupid OOP dogma and discovering that it doesn't work, by showing a more pragmatic approach, or that someone will understand me better and send me an email explaining what I am missing that makes OOP actually worth trying to get right. But practically speaking, this is just another random blog post that doesn't change anything. ¯\(ツ)/¯
Feedback
- https://enterprisecraftsmanship.com/posts/growing-object-oriented-software-guided-by-tests-without-mocks/ – another review of the same book, focused on attempting to improve the design using DDD. It looks to me very much like the design I am aiming at, with the exception of AuctionSniperViewModel being in essence a persistent event log, and UI, XMPP and AuctionSniper running as threads/actors communicating via the event log.